Abstract

We present the design of a visualization tool that graphically 
displays the strength of query concepts in the retrieved docu- 
ments. Graphically displaying document surrogate informa- 
tion enables set-at-a-time perusal of documents, rather than 
document-at-a-time perusal of textual displays. By pro- 
viding additional relevance information about the retrieved 
documents, the tool aids the user in accurately identifying 
relevant documents. Results of an experiment evaluating
the tool show that when users have the tool they are able
to identify relevant documents in a shorter period of time 
than without the tool, and with increased accuracy. We 
have evidence to believe that appropriately designed graph- 
ical displays can enable users to better interact with the 
system. 

1 Introduction 

The overall concern of all components of an IR system is to 
present the user as much relevant information as possible. 
While there has been a lot of work on effective algorithms 
for retrieving and ranking relevant documents, not much
attention has been paid to studying the effectiveness of user
interface components of IR systems. Apart from retrieval
mechanisms, interactive IR systems must also be concerned with
the design of appropriate display mechanisms that present 
the retrieved information in the "best possible manner". We 
discuss what constitutes "best possible" display by examin- 
ing a typical user interaction with an IR system. A typical 
interaction with current IR systems proceeds as follows: 

• User in an Anomalous State of Knowledge [BOB82] 
expresses his information need as a query that is in- 
terpretable by the system. 

• The system matches the query with the stored docu- 
ments and retrieves a set of documents. In the case 
of ranked output systems, the result is ranked in the 
decreasing order of relevance. Boolean systems may 
rank the documents in a chronological order. 

• At the first stage of display, a set of document
surrogates for the retrieved documents is displayed to
the user. These surrogates typically consist of a
combination of titles, author, source, date of publication,
etc.

• The user inspects the document surrogates and
requests more information (such as the full text if
available) about those that look relevant. This leads to a
second stage of display that provides as much
information about the document (in many cases, the complete
document itself) as is available in the system.

• After going through a sufficient number of documents,
the user quits the session or reformulates the query to
retrieve a better set of documents.

Permission to make digital/hard copies of all or part of this material for
personal or classroom use is granted without fee provided that the copies
are not made or distributed for profit or commercial advantage, the
copyright notice, the title of the publication and its date appear, and
notice is given that copyright is by permission of the ACM, Inc. To copy
otherwise, to republish, to post on servers or to redistribute to lists,
requires specific permission and/or fee.
SIGIR 97 Philadelphia PA, USA
Copyright 1997 ACM 0-89791-836-3/97/7. $3.50

In this scheme, the first stage display of document surrogates 
is meant to provide a concise and accurate indication of 
document content. The second stage display of documents 
provides more information about the document. In cases 
where the document full text may not be available for the 
second stage (such as a typical online library catalog), users 
proceed to a third stage where they examine a paper-copy 
in library bookshelves where the complete document may 
be available. 

Thus as the user progresses from the initial to the later 
stages of display, that which is displayed is more complete 
and informative, allowing increasingly accurate relevance 
judgments. However, since more information is displayed 
about a document in later stages of display, they are also 
more time-consuming to peruse. Furthermore, requesting a
second stage of display may be more costly since some
systems charge a fee to deliver the full text of
documents. Apart from the human frustration of waiting for the
delivery of full text, one may have to pay for it monetarily 
since certain systems charge the user based on connect-time 
and the volume of downloaded data. Therefore, it is advan- 
tageous for the searcher to be reasonably certain about the 
relevance of a document before requesting a second stage of 
display. 

For the user to make accurate relevance judgments based
on the first stage display, the form and content of the first
stage of display should provide a good indication of what the
document is about. The form of the first stage display should be such
that it is quickly perusable - the purpose of the first stage 
display (of providing a quick and concise indication of doc- 
ument content) is lost otherwise. The content of the first 
stage display should be such that users can make accurate 
judgments about document relevance. 

We can expect an improvement in the accuracy of relevance
judgment if more content from the documents is displayed.
For example, we can expect greater accuracy with
a first stage display that shows document titles, authors
and subject keywords compared to one that shows just the
document titles. When this additional document content is
displayed in textual form, however, the increased accuracy may
come at the cost of increased perusal time, because more time
is consumed perusing the additional content.

Figure 1: Visualization of results. The highlighted vertical
column corresponds to the document ranked 14. The title of
the document ranked 14 will also be highlighted in
the title display window. Clicking the highlighted vertical
column brings up the full text of that document.

A possible means of addressing this problem of displaying
more information in the first stage without increasing
perusal effort and perusal time is to display information in 
some form that does not require as much perusal time and 
screen space as text. Graphical displays (visualizations) of 
the characteristics of documents which are significant in sup- 
porting the decision to peruse or not, could enable set-at-a- 
time perusal of documents, rather than document-at-a-time 
perusal of text displays. 

In the remainder of this paper, we describe a visualiza- 
tion tool meant to address this issue; describe and present 
the results of an experiment evaluating the tool; and draw 
some conclusions about its effectiveness as a first stage
display.

2 Visualization tool 

The visualization tool is an add-on to a basic interface for
an IR system. There is a query window. The titles and
ranks of retrieved documents (first stage of display) are shown
below the query window. Figure 1 shows the visualization
tool corresponding to the query "How has affirmative-action
affected the construction-industry, construction projects and
public works".

The visualization consists of a series of vertical columns
of bars. There is one column of bars for each document.
The left-most vertical column corresponds to the document 
ranked 1 and the right-most vertical column corresponds to 
the document ranked 150. In each vertical column there are 
multiple bars, one for each query word. The height
of the bar at the intersection of a query-word-row and a
document-column corresponds to the weight of that query 
word in that document. Moving the mouse cursor over the 
vertical columns highlights the column directly beneath the 
mouse cursor and simultaneously highlights the title cor- 
responding to that document in the title-display window. 
The visualization window is scrollable, in case the number 
of query words exceeds the available vertical space. The 
words in the visualization are also stopped and stemmed. 
Thus the combination of the visualization tool and the ti- 
tle display forms the first stage of display in our system. 
The basic interface and the visualization tool utilize the
INQUERY retrieval engine, version 2.1p3 [CCH92]. 
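
The layout described above (one column per ranked document, one row per query word, bar height proportional to term weight) can be sketched in a few lines. This is only an illustrative text rendering under stated assumptions: the function name `render_bars`, the glyph ramp, and the example weights are hypothetical, and the sketch does not reproduce the actual tool or how INQUERY computes weights.

```python
from typing import Dict, List

# ASCII ramp of increasing "height"; index 0 means the term is absent.
# The ramp itself is an arbitrary choice made for this sketch.
BAR_GLYPHS = " .:-=+*#@"


def render_bars(weights: Dict[str, List[float]]) -> List[str]:
    """Return one text row per query word, one character per ranked document.

    `weights[word][i]` is the (hypothetical) weight of `word` in the
    document ranked i + 1, so the left-most cell is the top-ranked document.
    """
    peak = max((w for row in weights.values() for w in row), default=0.0)
    label_width = max((len(word) for word in weights), default=0)
    top = len(BAR_GLYPHS) - 1
    lines = []
    for word, row in weights.items():
        cells = []
        for w in row:
            if peak <= 0 or w <= 0:
                cells.append(BAR_GLYPHS[0])
            else:
                # Scale to 1..top so that any nonzero weight stays visible.
                cells.append(BAR_GLYPHS[max(1, round(w / peak * top))])
        lines.append(f"{word:<{label_width}} |{''.join(cells)}|")
    return lines


if __name__ == "__main__":
    # Made-up weights for four documents and two stemmed query words.
    example = {
        "affirm": [0.0, 0.8, 0.0, 0.4],
        "construct": [0.8, 0.4, 0.8, 0.2],
    }
    print("\n".join(render_bars(example)))
```

Reading a row left to right shows how one query word is distributed over the ranked list, which is the set-at-a-time view the section goes on to discuss.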

2.1 Response to the need for a concise display of docu- 
ment content 

In the Introduction, we discussed the need for a concise
first stage display which can also be perused quickly. We
believe this visualization scheme qualifies as such a first
stage display. It provides information valuable in deciding
the relevance of a document, such as the weight of query
concepts in the retrieved documents. The information is also
displayed in a highly condensed way, and allows many
document surrogates to be perused at one time. Textual
display of document surrogates forces the user to peruse them
a document-at-a-time. However, with this visualization one
can infer global patterns such as the following. Suppose
we are faced with a search topic where a query term 'q'
is so important that all relevant documents will have that
query word. We would then ask the following questions:
To identify relevant documents, we might ask "Which
documents have the important query word 'q'?". To evaluate the
goodness of the query, we might ask "Does the important
query word 'q' appear in most of the retrieved documents?".
When comparing the contribution of two query words, one
might ask questions such as "What is the contribution of
query word q2 compared to q5?". Answers to such
questions seem to emerge from the visualization quickly. Such
global perception of data is not possible with text displays
that emphasize the parts rather than the whole. We refer
to this kind of global perception as "set-at-a-time perusal",
since the information gained is about a set of documents.

The presence or absence of specific significant words can 
be quickly seen, and it is possible, in one glance, to identify 
sequences of documents which do, or do not have important 
contributions from specific query words. For the example 
search topic ("How has affirmative action affected the
construction industry?"), there are two facets that are central:
"affirmative action" and "construction industry". From the
visualization tool, we can immediately see that most of the
documents are concerned with the "construction industry"
and only a portion of them have the term "affirmative
action". We can also see that the "affirmative action" concept
is spread sparsely throughout the top 70 documents. The
graphical format of presentation has some important advan- 
tages in that it is more condensed and can be more easily 
and quickly perused than an equivalent text display. 
3 Related work

A number of visualization schemes for information retrieval 
have been proposed [CRM91, MFH95, Kor91, Spo94, HKW94,
ACRS93, AB93]. But most of these do not address either the
display of query results or the problem of support of rele- 
vance assessment. An exception is TileBars [Hea95], but 
there are some important ways in which TileBars differs 
from the visualization proposed here. 

• TileBars provides information on how the different query
facets overlap in different sections of a long document.
Our visualization scheme does not provide information
at that fine a level of granularity.

• To make the best use of such additional information in 
TileBars, the user has to decompose the information 
need into more-or-less orthogonal facets of a query. 
However, in our visualization, the user can type in the 
information need as a free-form textual query. 

• TileBars presents the document surrogates in a list, 
making it more difficult than in our tool to gain an 
overall picture of the query word distribution for a 
whole set of documents in one glance. 

• TileBars seems best suited for long documents, while 
our visualization scheme seems to be equally effective 
for short and long documents. 

There are a handful of studies that have investigated the 
effectiveness of document surrogates as content-indicators to 
enable human relevance judgments [Jan91, Sar69, RRS61, 
Tho73, MKB78]. None of them studied the effectiveness of 
graphical displays (visualizations) of document surrogates 
as content indicators. A result common to all of these stud- 
ies is that "accuracy" in relevance judgments increases with 
increasing information (e.g. Title < Abstract < Full text). 
On the whole, we find that there has been a lack of studies to 
evaluate the effectiveness of graphical displays of document 
surrogates as indicators of relevance. This is mainly due to 
the fact that only recently has it been technologically and 
economically feasible to render such displays in real-time by 
the computer. Our study is an attempt to fill that gap. 

4 Experimental Setup 

In this section, we discuss an experiment to test the effec- 
tiveness of the visualization tool as a first stage display, and 
as a tool to aid effective query reformulation. The part on 
query reformulation will be discussed in a subsequent paper. 
We used a portion of the TREC [Har96] database consisting
of all of disk1 and disk2 except the "Federal Register"
documents. We did not use the Federal Register documents
because a high proportion of them did not have a title. We
used INQUERY 2.1p3 as the search engine [CCH92]. The
retrieval mechanism of the search engine is based on Bayesian
inference networks using the word occurrence statistics in 
documents. All of the TREC information topics that we 
used were very detailed in their description of information 
need. We picked ten information topics for this study. The 
criterion used to pick the topics will be discussed below. 

A slightly modified version of the Description field (mainly
removing the introductory words such as "Document will
report") was submitted to the retrieval system. 120
documents from the top 150 retrieved documents were obtained
and split into two groups as follows: a high precision group
consisting of the 60 documents ranked 1 through 60 and a low
precision group consisting of the 60 documents ranked 91 through
150. We controlled for precision (see footnote 1) as a factor in
the experiment since we felt that precision might impact the
perusal time: users might identify non-relevant documents
more quickly than relevant documents. Earlier studies
[Sar69, RRS61, MKB78] indicate that precision also
influences the ability to judge non-relevance.

Each of the two precision groups was further split into
two groups: documents with odd ranks and documents
with even ranks. Thus, there were 4 groups of 30 documents
for each information topic: High_precision_even_ranks,
High_precision_odd_ranks, Low_precision_even_ranks and
Low_precision_odd_ranks. The criterion used to pick the
information topics for this study was that the "description"
field when used as the query statement must retrieve a set
of documents that had a distinct split in the precision
values between the high precision group (ranks 1 through 60)
and the low precision group (ranks 91 through 150). Since
we did not want any overlap in precision values between 
the high precision group and the low precision group for all 
the ten chosen topics, we discarded the documents ranked 
61 through 90. The precision values in the high precision 
group for all the chosen topics ranged from 0.43 to 0.6 while 
those of the low precision group ranged from 0.03 to 0.23. 
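
The partition of the top 150 documents into four groups of 30 can be sketched as follows. This is a minimal illustration of the rank ranges and odd/even split stated above; the function names and the lowercase group keys are assumptions made for the sketch, not identifiers from the experimental software.

```python
from typing import Dict, Iterable, List, Set


def split_into_groups(ranked_docs: List[str]) -> Dict[str, List[str]]:
    """Partition a ranked list (rank 1 first) into the four 30-document groups.

    Ranks 1-60 form the high precision range, ranks 91-150 the low
    precision range, and ranks 61-90 are discarded. Each range is then
    split by odd and even rank.
    """
    if len(ranked_docs) < 150:
        raise ValueError("need at least the top 150 retrieved documents")
    groups: Dict[str, List[str]] = {
        "high_precision_odd_ranks": [],
        "high_precision_even_ranks": [],
        "low_precision_odd_ranks": [],
        "low_precision_even_ranks": [],
    }
    for rank, doc in enumerate(ranked_docs[:150], start=1):
        if rank <= 60:
            level = "high"
        elif rank >= 91:
            level = "low"
        else:
            continue  # ranks 61 through 90 are not used
        parity = "even" if rank % 2 == 0 else "odd"
        groups[f"{level}_precision_{parity}_ranks"].append(doc)
    return groups


def precision(docs: Iterable[str], relevant: Set[str]) -> float:
    """Density of relevant documents in a group."""
    docs = list(docs)
    return sum(d in relevant for d in docs) / len(docs)
```

With assessor judgments available as a set of relevant document identifiers, `precision` can be applied to each group to check that a candidate topic shows the required gap between the high and low precision ranges.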

The experiment we describe was aimed at investigating 
the effect of visualization on two problems for users: 

• accurately identifying relevant documents 

• effectively reformulating queries 

In this paper, we report on results relevant to only the first 
of these, but because both problems were addressed in the 
same experimental design, we describe the entire experi- 
ment. 

In the experiment, users were given two different types 
of tasks: 

• Task of judging relevance: The users were given the 
information topic and the search statement used to 
retrieve documents. They were asked to judge the rel- 
evance of each of the 30 documents that were displayed 
to them as one of 

- relevant to the information topic. 

- non-relevant to the information topic. 

- Unsure. 

For the purposes of the current experiment, clicking 
the left mouse-button over a document title in the 
title-display window or over a vertical column in the 
visualization window marks the document as relevant. 
Clicking the right mouse button over the title (or the 
column in the visualization window) marks the doc- 
ument as non-relevant. Middle-clicking it marks the 
document as "Unsure". Also, left-clicking a query 
word in the visualization window marks all documents 
containing that query word as relevant. Right-clicking 
a query word marks all documents that do not contain 
that word as non-relevant. Full text or any other infor- 
mation about the documents was not made available 
to users. 

• Query reformulation task: Here the users were asked
to "modify the preconstructed query into a form that
will retrieve more relevant documents". For half of
the topics, users had the visualization tool and for the
other half users did not have the visualization tool,
making it a within-subjects, between-topics study.

Footnote 1: Precision is the density of relevant documents.

For the "relevance judgment" task, precision (two levels:
high and low) and visualization (two levels: with or without)
were controlled in this within-subjects, within-topics study.
The even ranked document group was shown with the
visualization tool and the odd ranked document group was shown
without the visualization tool. The users were not told that
the 4 different document groups had two different precision
levels. Instead, they were told that the query was issued
against 4 different databases and the top 30 documents from
each database were presented to them as 4 separate tasks,
two with and the other two without visualization. For a
given topic, the first task was always a "relevance judgment" 
task with a high-precision group. The next task was a query 
reformulation task. The third, fourth and fifth tasks were 
relevance judgment tasks for the other three groups of 30 
documents. The first task was always a relevance judgment 
task because we wanted the users to have a good feel for 
the retrieved set of documents before they embarked on the 
query reformulation task. The first task of relevance judg- 
ment was always done with a high-precision document group 
because, in the real-world the users almost always inspect 
the top-ranked high-precision document range before they 
go down the ranks to inspect the low-precision range. Each 
user did the 5 tasks (4 relevance judgment tasks for the 4 
document groups, and one query reformulation task) for 6 
information topics, and finally did the search reformulation 
task for 4 more topics. The 6 topics for which the users did 
both the relevance judgment and query reformulation were: 

• Topic 77: Document will report a poaching method 
used against a certain type of wildlife. 

• Topic 115: Document will report specific consequence(s) 
of the U.S.'s Immigration Reform and Control Act of 
1986. 

• Topic 134: Document will report on the objectives,
processes, and organization of the human genome project. 

• Topic 136: Document will report on attempts by Pa- 
cific Telesis to diversify beyond its basic business of 
providing local telephone service. 

• Topic 145: Document will describe how, and how
effectively, the so-called "pro-Israel lobby" operates in
the United States.

• Topic 197: Document will discuss legal tort reform (a 
civil wrong for which the injured party seeks a judg- 
ment) with regard to placing limitations on monetary 
compensation to plaintiffs. 

The order in which the six topics were presented was
balanced across the 37 subjects. The order in which the
two visualization conditions appeared for a given topic was
also balanced. The order in which the two precision groups
appeared in a given topic was not balanced due to the
constraint that a high precision group is always the first
condition.

The human subjects in this experiment were Georgia 
Tech undergraduate students enrolled in a one-credit hour 
class on library searching. Students who participated in the 
study got full scores in two homework assignments. The 
complete experiment was split over two days. Subjects were 
asked to sign a consent form upon arrival. They were then
given a demo of the system by the experimenter. They then
had a hands-on tutorial where they practiced both the "rel- 
evance judgment" task and the "query reformulation" task. 
Then, they did the 5 tasks for each of three information
topics, marking the end of the experiment for the first
day. On the second day, they did the 5 tasks for each of the 
other 3 topics, followed by the "query reformulation" task 
for 4 other topics. 

The subjects were given monetary incentive to do well 
in the experiment. They were evaluated as follows: We 
knew a priori the relevance of all the documents as given
by the TREC assessors. For the relevance judgment task,
for each document the user obtained +1 point if their
relevance judgment matched the TREC assessor's judgment,
-1 point if their judgment did not match, and 0 points if
they were "Unsure". The user had to judge all of the 30
displayed documents. Thus, for the 4 groups of 30 documents,
for the 6 topics, each subject made a total of 4x30x6 = 720 
judgments. 
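
The point scheme can be written down directly. The sketch below assumes judgments are stored as the strings "relevant", "non-relevant" and "unsure" keyed by a document identifier; these labels, the data layout and the function name are illustrative assumptions.

```python
from typing import Dict


def score_judgments(user: Dict[str, str], trec: Dict[str, bool]) -> int:
    """Points for one task: +1 per match, -1 per mismatch, 0 for Unsure.

    `trec[doc]` is True when the TREC assessor judged the document
    relevant. Every displayed document must have a user judgment.
    """
    points = 0
    for doc_id, is_relevant in trec.items():
        judgment = user[doc_id]  # KeyError if a document was left unjudged
        if judgment == "unsure":
            continue
        if judgment not in ("relevant", "non-relevant"):
            raise ValueError(f"unknown judgment: {judgment!r}")
        points += 1 if (judgment == "relevant") == is_relevant else -1
    return points


# Judgments per subject: 4 groups x 30 documents x 6 topics.
JUDGMENTS_PER_SUBJECT = 4 * 30 * 6
```

For one group of 30 documents the score therefore lies between -30 and +30, and marking everything "Unsure" yields 0.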





                          TREC judgment
                          Rel      Not-rel
User judgment   Rel       RuRt     RuNt
                Not-rel   NuRt     NuNt
                Unsure    UuRt     UuNt



The time taken by the subject to complete a task was also 
noted down. The top 10 quickest subjects with the most 
points were given monetary awards as follows: All
participants were ranked in increasing order of time and in
decreasing order of points scored. Each participant's ranks in
the two categories (time and points) were added to get the sum
rank. The participant with the lowest sum rank was
considered the best performer. Hence, to do well, one must be
both accurate and quick. The top performer was given $50, 
the second and third performers were given $30 each, the 
fourth through sixth performers were given $20 each and 
the seventh through the tenth performers were given $10 
each. The participants were told of the rating scheme, so 
we can assume that they optimized for time and accuracy 
equally. 
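
The sum-rank procedure can be illustrated with a short sketch. The text does not say how ties in time, points or sum rank were resolved, so the sketch simply falls back on input order, which is an assumption; the function names and the example data are likewise hypothetical.

```python
from typing import Dict, List, Tuple

# Award amounts in dollars for the 1st through 10th performers.
AWARDS = [50, 30, 30, 20, 20, 20, 10, 10, 10, 10]


def sum_rank_order(results: Dict[str, Tuple[float, int]]) -> List[str]:
    """Order participants by time rank plus points rank, best first.

    `results[name]` is (total time, total points). Ties are broken by
    input order because Python's sort is stable; the source does not
    specify a tie-breaking rule.
    """
    by_time = sorted(results, key=lambda p: results[p][0])     # fastest first
    by_points = sorted(results, key=lambda p: -results[p][1])  # most points first
    time_rank = {p: i for i, p in enumerate(by_time, start=1)}
    points_rank = {p: i for i, p in enumerate(by_points, start=1)}
    return sorted(results, key=lambda p: time_rank[p] + points_rank[p])


def assign_awards(order: List[str]) -> Dict[str, int]:
    """Pair the top performers with the award amounts."""
    return dict(zip(order, AWARDS))


if __name__ == "__main__":
    # Made-up times (seconds) and points for four participants.
    demo = {"s1": (100.0, 60), "s2": (80.0, 50), "s3": (90.0, 20), "s4": (120.0, 40)}
    print(assign_awards(sum_rank_order(demo)))
```

Because the two ranks are added with equal weight, a participant cannot win by being fast but inaccurate, or accurate but slow.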

Since we claim that graphical display of additional
document surrogates does not increase perusal time significantly
(due to the set-at-a-time perusal of documents), we
predict that the time taken to complete the task for the
visualization group will not be significantly higher than for the
non-visualization group. We also predict an increase in
accuracy of relevance judgments for the visualization group,
because we claim that very pertinent document surrogate 
information (i.e., the weight of query words in the retrieved 
documents) is being displayed in addition to the standard 
text surrogates such as title and source. 

Effectiveness of the visualization tool was measured by 
what the subjects optimized upon: time, accuracy and the 
combined time-accuracy rank, where accuracy is the number 
of correct judgments minus the number of incorrect judg- 
ments after discarding the Unsure judgments, i.e., Accuracy 
= RuRt+NuNt-RuNt-NuRt. However, since the accuracy 
measure includes the correct judgments, Type I errors and 
Type II errors all in one score, we split the accuracy measure 
into distinct components. Here we borrow the analogs of two 
traditional IR measures "recall" and "precision" and extend
them to the interactive situation. In the traditional recall 
and precision measures, the number of documents that the 
system judges to be relevant is artificially determined by a 
cut-off point of top 'X' documents. Let RsRt be the number 
of documents judged relevant by the system and relevant by 
the TREC assessor (the user with the original information 
need). Let RsNt be the number of documents judged
relevant by the system and non-relevant by the TREC assessor.
Let NsRt be the number of documents judged non-relevant 
by the system and relevant by the TREC assessor. Let
NsNt be the number of documents judged non-relevant by 
the system and non-relevant by the TREC assessor. 

While traditional "Recall" refers to the ratio of truly
relevant documents that the system judged as relevant (i.e.,
RsRt/(RsRt + NsRt)), we define "Interactive Recall" as the
ratio of the truly relevant documents that were judged as
relevant by the user (i.e., Interactive Recall = RuRt/(RuRt +
NuRt + UuRt)). While traditional "Precision" refers to the
ratio of documents judged as relevant by the system that
were truly relevant (i.e., RsRt/(RsRt + RsNt)), we define
"Interactive Precision" as the ratio of the documents judged
as relevant by the user that were truly relevant (i.e., Interactive
Precision = RuRt/(RuRt + RuNt)). Here, a "truly
relevant" document is a document that was judged relevant by
the TREC assessor. Thus, if we are trying to build an
effective first stage display mechanism, we would strive for a
display mechanism which would enable a user to pick (and
read the full-text of) all of the relevant documents and only
the relevant documents displayed. When a user picks a
non-relevant document as relevant, time and money are
wasted perusing a non-relevant document. Conversely,
not being able to pick a relevant document means missing
out on relevant information.

However, "Unsure" documents pose a problem. They can 
be handled in two ways. If we assume that a user always 
reads the full text of an Unsure document, we should treat 
the Unsure documents as being judged relevant by the user. 
Conversely, if a user always skips over an Unsure document, 
we should treat the Unsure document as being judged 
non-relevant by the user. Below, we present the analysis under 
both interpretations. Thus, if we assume that the user 
inspects the Unsure documents, we treat the Unsure 
documents as relevant: 

Interactive Recall = (RuRt + UuRt) / (RuRt + NuRt + 
UuRt) 

Interactive Precision = (RuRt + UuRt) / (RuRt + UuRt + 
RuNt + UuNt) 

If we assume that the user does not inspect the Unsure 
documents, we treat the Unsure documents as non-relevant: 

Interactive Recall = RuRt / (RuRt + NuRt + UuRt) 

Interactive Precision = RuRt / (RuRt + RuNt) 
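
The four formulas above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the analysis code used in the experiment; the class name `Counts`, the function names, and the sample counts are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """User judgment (R/N/U) crossed with TREC judgment (R/N)."""
    RuRt: int = 0  # user: relevant,     TREC: relevant
    RuNt: int = 0  # user: relevant,     TREC: non-relevant
    NuRt: int = 0  # user: non-relevant, TREC: relevant
    NuNt: int = 0  # user: non-relevant, TREC: non-relevant
    UuRt: int = 0  # user: unsure,       TREC: relevant
    UuNt: int = 0  # user: unsure,       TREC: non-relevant


def interactive_recall(c, unsure_as_relevant):
    """Share of truly relevant documents that the user picked."""
    picked = c.RuRt + (c.UuRt if unsure_as_relevant else 0)
    denom = c.RuRt + c.NuRt + c.UuRt
    return picked / denom if denom else None  # None marks 0/0


def interactive_precision(c, unsure_as_relevant):
    """Share of documents picked by the user that are truly relevant."""
    if unsure_as_relevant:
        num = c.RuRt + c.UuRt
        denom = c.RuRt + c.UuRt + c.RuNt + c.UuNt
    else:
        num = c.RuRt
        denom = c.RuRt + c.RuNt
    return num / denom if denom else None  # None marks 0/0


# Hypothetical counts for one subject/topic/condition cell.
example = Counts(RuRt=6, RuNt=2, NuRt=3, NuNt=8, UuRt=1, UuNt=0)
```

Returning `None` for a zero denominator keeps the undefined cases explicit instead of silently treating them as zero.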

In summary, our hypotheses are: 

• Visualization will not increase the time taken to com- 
plete the relevance judgment task. 

• Visualization will improve the Accuracy of relevance 
judgments. 

• Visualization will improve Interactive Recall. 

• Visualization will improve Interactive Precision. 

5 Results 

Statistical analysis of the experimental data empirically shows 
that our hypotheses about the relevance judgment task are 
valid. Since there were 37 subjects, and all subjects did 6 
topics with 4 tasks (for each of the 4 groups within the topic) 
per topic, there were a total of 37 x 6 x 4 = 888 observa- 
tions. The approach used in all analyses was to construct 
a least squares, linear additive model of each performance 



measure as a function of the main effects and interactions 
of the manipulated experimental variables. 

The need for consideration of possible learning/ordering 
effects, due to the same subjects providing multiple responses 
at various experimental conditions, is minimized by 
balancing the order in which the different experimental 
conditions are presented to the subjects. However, due to the 
requirement that within a topic the high precision condition 
always be presented first, this balance could not be 
achieved for this factor. To account for this, the model 
included a term representing the observation order within 
each subject/topic combination. The design thus allows for 
independent estimation of all effects except precision and 
observation order. The analysis presented will focus on the 
statistical significance of each term assuming the presence 
of the other terms in the model (i.e., on the adjusted sums 
of squares in the Analysis of Variance (ANOVA) tables), as 
this provides an evaluation of the marginal effect. 

The residuals of the models constructed were analyzed 
to assure reasonable compliance with the normality, 
independence and constant variance assumptions required for 
validity of ANOVA. 

For the dependent variable "time", the residuals 
indicated a higher variance for conditions resulting in larger 
values of time, and hence we transformed time values into 
log10(time in seconds) to check for statistical significance. 
The ANOVA tables for log10(time), accuracy and final score 
are shown in Tables 1, 2 and 3 respectively. The means and 
standard errors are shown in Table 4. As can be seen from 
the tables, viz is significantly better than noviz for logtime, 
accuracy and final score. It is also clear that the low precision 
condition does significantly better than high precision for 
logtime, accuracy and final score. The interaction effects of 
precision and visualization are shown in Figures 2, 3 and 4 
with a 95% confidence interval around the means. When 
precision is high, visualization does not significantly affect 
logtime, but when precision is low, there is a decrease in 
logtime of 0.08. This corresponds to a reduction of 17.2 
seconds, nearly a 20% decrease in average time required. Thus 
we can conclude that the visualization tool helps users 
identify document relevance more quickly. It is also 
interesting to note (from Table 1) that the interaction effect 
of topic with visualization was not statistically significant, 
although the main effect of topic was significant. Thus, 
visualization helps improve speed of judgment irrespective of 
topic. 
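
The 17.2 second figure can be checked by back-transforming the low precision logtime means reported in Table 4 (2.01 without and 1.93 with visualization). The short sketch below does only that arithmetic; the variable names are introduced here for illustration, and the back-transformed values are geometric-mean style estimates rather than arithmetic means of the raw times.

```python
# Least squares means of log10(time in seconds), low precision (Table 4).
logtime_without_viz = 2.01
logtime_with_viz = 1.93

# Back-transform from the log10 scale to seconds.
seconds_without_viz = 10 ** logtime_without_viz  # about 102.3 s
seconds_with_viz = 10 ** logtime_with_viz        # about 85.1 s

reduction_seconds = seconds_without_viz - seconds_with_viz  # about 17.2 s

# The same reduction expressed against each of the two baselines.
relative_to_without = reduction_seconds / seconds_without_viz
relative_to_with = reduction_seconds / seconds_with_viz

print(f"reduction: {reduction_seconds:.1f} s")
print(f"relative to no-visualization time: {relative_to_without:.1%}")
print(f"relative to with-visualization time: {relative_to_with:.1%}")
```

The reduction is roughly 17% of the back-transformed no-visualization time and roughly 20% of the with-visualization time, so the size of the percentage depends on which baseline is used.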

For the accuracy measure, there is no significant 
interaction between precision and visualization, as shown by the 
almost-parallel lines in Figure 3. Precision has a large 
impact on accuracy, again consistent with previous studies 
[Sar69, MKB78]. While the effect of visualization on 
accuracy is significant, it is not as large as the effect of precision. 
Users can identify document relevance more accurately with 
the visualization tool than without. The ability of users 
to identify non-relevant documents as non-relevant is much 
higher than their ability to identify relevant documents as 
relevant. This is reflected in the significantly higher 
accuracy value for low precision than for high precision. It is 
also interesting to note (from Table 2) that the interaction 
between topic and visualization was statistically significant. 
However, the main effect of visualization was much greater 
than the topic*viz interaction effect. 

Final score is a rank measure, which reflects the user's 
ability to accurately and quickly identify document relevance. 
It is plotted in Figure 4. Lower values are better for final 
score. As with accuracy, precision has a much higher impact 
than visualization, but both variables have a significant 
effect. The visualization tool improves final score, and so does 
low precision. There is a higher proportion of non-relevant 
documents in the low precision condition. This implies that 
users can more quickly and accurately judge a non-relevant 
document as non-relevant compared to judging a relevant 
document. It is also interesting to note (from Table 3) that 
the interaction between topic and visualization was 
statistically significant. However, the main effect of visualization 
was much greater than the topic*viz interaction effect. 



Table 4: Least Square Means and Standard errors for Logtime, 
Accuracy and Final score 

Precis   Viz       Logtime      Accur        FinScor 
Low      Without   2.01         15.72        353.2 
Low      With      1.93         17.54        288.4 
High     Without   2.04         5.72         576.1 
High     With      2.04         6.87         560.2 
Std. err. of est.  [illegible]  [illegible]  9.4 

Table 1: ANOVA for log10(time). 

Source       DF   Adj SS    Adj MS    F       P 
topic        5    1.59993   0.31999   21.28   0.000 
precis       1    0.45954   0.45954   30.56   0.000 
viz          1    0.36761   0.36761   24.44   0.000 
precis*viz   1    0.29215   0.29215   19.43   0.000 
topic*viz    5    0.15612   0.03122   2.08    0.067 



Table 2: ANOVA for Accuracy. 




Source       DF   Adj SS     Adj MS     F        P 
topic        5    10566.13   2113.23    95.04    0.000 
precis       1    11842.00   11842.00   532.55   0.000 
viz          1    490.54     490.54     22.06    0.000 
precis*viz   1    24.67      24.67      1.11     0.293 
topic*viz    5    1248.65    249.73     11.23    0.000 



Table 3: ANOVA for Final Score. 

Source       Adj SS    Adj MS        F             P 
topic        1437538   [illegible]   [illegible]   0.000 
precis       6789177   6789177       429.68        0.000 
viz          362841    362841        22.96         0.000 
precis*viz   133133    133133        8.43          0.004 
topic*viz    352669    70534         4.46          0.001 



As discussed before, accuracy combines the following 
four items into one: the ability to judge relevant and non-relevant 
documents (RuRt + NuNt); Type I error, i.e., wrongly 
rejecting relevant documents; and Type II error, i.e., wrongly 
accepting non-relevant documents. We feel that identifying 
non-relevant documents (NuNt) in and of itself is not as 
important as the other three items, for it is important 

• to minimize Type I errors, or else one runs the risk of 
missing out on too many relevant documents. 

• to minimize Type II errors, or else one runs the risk 
of wasting too much money and effort in examining 
non-relevant documents. 

We can capture all the interesting data with interactive re- 
call and interactive precision as described in the previous 
section. In our tables, when users are assumed to treat un- 
sure documents as relevant, the interactive precision and 
interactive recall are denoted by "iprecwu" and "irecwu" 
respectively. Correspondingly, when unsure documents are 
assumed to be treated as non-relevant, interactive precision 



Figure 2: Interaction effects of precision and visualization 
on logtime. 
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Figure 3: Interaction effects of precision and visualization 
on accuracy. 
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cision when Unsure documents were treated as non-relevant 
(iprecwou) at the 0.05 level; however, it was significant when 
Unsure documents were treated as relevant (iprecwu) (see 
Figure 5). Although statistically significant, the absolute 
increase in interactive precision is minimal (about 0.015). 
In contrast, visualization had a significant effect on interactive 
recall, both when Unsure documents were treated as 
non-relevant (irecwou) and when Unsure documents were treated 
as relevant (irecwu). In absolute terms, the 
improvement in interactive recall due to visualization is 
approximately 0.07 +/- 0.02 (about a 15% increase). Clearly 
this is of sufficient magnitude to be of practical importance. 



Table 5: ANOVA for Interactive Precision "iprecwou" (Un- 
sure documents treated as non-relevant) 



Source          DF   Adj SS    Adj MS    F        P 
topic+ord       5    8.15469   1.63094   163.81   0.000 
viz             1    0.03065   0.03065   3.08     0.081 
topic+ord*viz   5    1.70775   0.34155   34.31    0.000 



Figure 4: Interaction effects of precision and visualization 
on Final Score. 



and interactive recall are denoted by the mnemonics 
"iprecwou" and "irecwou" respectively. 

In considering the interactive precision measure, there 
are a large number of cases where the values result in 
responses of zero divided by zero, because users did not pick any 
of the displayed documents as relevant. Rather than 
eliminate these cases, the raw data (i.e., RuRt, RuNt, NuRt, 
NuNt, UuRt, UuNt) were aggregated over the high and low 
precision levels for the same viz condition, and the interactive 
precision and interactive recall measures were then computed. 
Thus, for example, for topic 77, the RuRt value for the 
high-precision viz case for subject 1 was added to the RuRt 
value of the low-precision viz case of the same subject 1 and 
same topic 77. We thus end up with 444 observations 
instead of the original 888 observations. This eliminated the 
need for the "precision" term in the model, although the 
variability due to this factor is included in the error term. 
One of the terms is labeled "topic+ord" because the "topic" 
term also includes some "condition order" effects, since for 
different topics the four conditions appeared in different 
orders. The design is now orthogonal with respect to the remaining 
factors. However, for interactive precision when Unsure 
documents are considered non-relevant (iprecwou), there remain 
2 cases where the response variable is still zero divided by 
zero. The result is a design where estimated effects are 
minimally dependent. Also, there are some quantization errors 
introduced in the interactive precision measure due to the 
denominator value being too close to zero (see footnote 2). The statistical 
significance of visualization for interactive precision and 
interactive recall (with Unsure documents treated as relevant 
and as non-relevant) is shown in Tables 5, 6, 7 and 8, and 
Table 9 shows the estimated means. 

Visualization had no significant effect on interactive pre- 

2 For interactive precision when unsure documents are considered 
non-relevant (iprecwou), there were 2 cases where the denominator 
had a value of 1, 5 cases of value 2, and 6 cases of value 3. For interactive 
precision when unsure documents are considered relevant (iprecwu), 
there were 0 cases of denominator values 0 and 3, and 1 case each of values 1 and 
2. Given that there were 444 observation points, these quantization 
errors are not expected to distort the results much. 



Table 6: ANOVA for Interactive Precision "iprecwu" (Un- 
sure documents treated as relevant) 

Source          DF   Adj SS    Adj MS    F        P 
topic+ord       5    6.90892   1.38178   180.62   0.000 
viz             1    0.04166   0.04166   5.45     0.021 
topic+ord*viz   5    1.02194   0.20439   26.72    0.000 



Table 7: ANOVA for Interactive Recall "irecwou" (Unsure 
documents treated as non-relevant) 



Source          DF   Adj SS    Adj MS    F       P 
topic+ord       5    3.04486   0.60897   34.92   0.000 
viz             1    0.62601   0.62601   35.89   0.000 
topic+ord*viz   5    0.72200   0.14440   8.28    0.000 



6 Conclusions 

We have presented a visualization tool designed to be an 
effective first stage display of retrieved documents. The 
results about the query reformulation task and a detailed 
analysis of all the experimental factors can be found in the 
thesis by Veerasamy [Vee97]. User experiments empirically 
show that when precision is low, the visualization tool helps 
users identify document relevance about 20% more 
quickly. Our hypothesis was that the time taken to judge 
relevance would not be higher for visualization, because we 
claimed that graphically displaying additional information 
would not take additional time to peruse, since it enables 
set-at-a-time perusal. While this argument is certainly validated 
by the experimental results, we also see that visualization 
actually seems to decrease the time taken. We see only one 
explanation for this: users consult the visualization before they 
consult the titles, thereby not looking at the titles of those 
documents which are clearly non-relevant. Thus they save 
Table 8: ANOVA for Interactive Recall "irecwu" (Unsure 
documents treated as relevant) 

Source          DF   Adj SS    Adj MS    F       P 
topic+ord       5    2.35410   0.47082   30.21   0.000 
viz             1    0.42787   0.42787   27.46   0.000 
topic+ord*viz   5    0.41879   0.08376   5.37    0.000 



Table 9: Least squares means of iprecwou, iprecwu, irecwou, 
and irecwu 

viz         iprecwou   iprecwu   irecwou   irecwu 
Without     0.6117     0.5753    0.4454    0.5484 
With        0.6284     0.5947    0.5209    0.6108 
Std error   0.007      0.006     0.009     0.008 
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Figure 5: Effect of visualization on interactive precision 
(when Unsure documents are treated as relevant and non- 
relevant documents). 
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Figure 6: Effect of visualization on interactive recall (when 
Unsure documents are treated as relevant and non-relevant 
documents). 



the time needed to read titles for those non-relevant doc- 
uments. This is in agreement with the study by Saracevic 
[Sar69] which shows that minimal information is needed to 
say that a document is non-relevant. However, to say that 
a document is relevant, much more information is needed. 
This is also confirmed by the fact that the magnitude of 
time-decrease due to visualization is much higher in the low 
precision condition than in the high precision condition. On 
the whole we see confirmation of our argument about 
set-at-a-time perusal of documents in graphical displays. 

The experiment also shows that users with the 
visualization tool identified document relevance significantly 
more accurately, both in terms of the aggregate "Accuracy" 
measure and in terms of the broken-down measure of 
"Interactive Recall". The result about the influence 
of precision on relevance judgment Accuracy is in 
agreement with previous studies by Saracevic [Sar69] and 
Marcus et al. [MKB78]. Their studies, like ours, also show that 
users are better able to judge non-relevance than relevance. 
However we do not see an interaction between precision and 
visualization on Accuracy. Thus visualization seems to help 
increase Accuracy to the same extent irrespective of the den- 
sity of relevant documents. There is a marked difference in 
a user's ability to judge the relevance of relevant documents 
and non-relevant documents. Given this difference, we feel 
that precision (i.e., the density of relevant documents among 
the displayed documents) should be a variable that must be 
controlled in experiments that measure a user's ability to 
judge relevance. Further, care should be taken in making 
claims purely based on a compound measure such as "Ac- 
curacy" that combines both the ability to correctly identify 
relevant documents and the ability to correctly identify non- 
relevant documents. 

We broke down the accuracy measure into two compo- 
nents: interactive precision and interactive recall to gain 
a better understanding of the relevance judgment process. 
While the effect of the visualization tool was marginally 
significant for interactive precision, it was highly significant for 
interactive recall. Thus, we can safely say that the visualiza- 
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ABSTRACT 

Current user interfaces of full text retrieval systems do 
not help in the process of filtering the result of a query, 
which is usually very large. We address this problem and 
propose a visual interface to handle the result of a query, 
based on a hybrid model for text. This graphical user 
interface provides several visual representations of the 
answer and its elements (queries, documents, and text), 
easing the analysis and the filtering process. 

Keywords: visual browsing, visual text 
databases, visual tools, visual representations, 
visual query languages, set visualization, vi- 
sual analysis. 

1 Introduction 

Full text retrieval systems are a popular way of provid- 
ing support for on-line text. Their advantage is that 
they avoid the complicated and expensive process of se- 
mantic indexing. From the end-user point of view, full 
text searching of on-line documents is appealing because 
a valid query is just any word or sentence of the doc- 
ument. However, there is no standard query language 
despite the wide range of features provided by commer- 
cial systems. 

On the other hand, traditional information retrieval 
(IR) systems have a formal foundation based on early 
library applications and other well-structured textual 
databases [29, 30]. However, those foundations are not 
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suitable for texts without a fixed structure or no struc- 
ture at all because they assume that a text is based 
on words and documents. In hypermedia or genetic 
databases those assumptions are not valid. 

Querying is just one part of the semantic process. 
The other part is to select the document that you 
are looking for. Many researchers in IR have pointed 
out the problems concerning the user's understanding 
of the system due to poor interfaces (for example, 
see [9, 17]). We present yet another visual interface to 
browse over queries, documents and text structure and 
contents. This interface is based on a hybrid model for 
textual databases [3, 4], which combines the classic IR model 
and the model used by the PAT text searching system 
[11, 28], which sees the text as a sequence of characters 
with no predefined text structure. This model is flexible, 
extensible, powerful and rather general. 

Although there have recently been several papers dealing 
with visualization of databases, our approach combines 
previous work with several new ideas. We present 
a set of visual tools that allow query manipulation and 
document analysis and browsing. The main issue is how 
to visualize large sets of documents. The same ideas can 
be applied to large sets of files or network addresses, 
which appear as results in many operating systems or 
Internet tools. A graphical query language based on the card 
paradigm has already been developed and included in a 
commercial product [1], but it is not covered here and 
is presented in a forthcoming paper [6]. 

We first summarize previous work on the topic, 
followed by the description of the text database model 
used. The main section presents our ideas for a visual 
interface to handle queries, sets of documents and their 
contents, as well as a visual analysis tool. The ideas 
presented here are the synthesis of the author's experience 
in several software projects related to full text retrieval 
systems and user interfaces [13, 2, 25, 1]. Although some 
of the ideas presented here are not new, we believe that 
the relevance of the paper is to look at them in an 
integrated manner and within the context of large full-text 
databases. 
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2 Previous Work 

Most visual representations focus on some specific as- 
pects. In text retrieval we can distinguish visualizations 
for a single document, several documents or queries. 
Most of the time only one of those elements is 
visualized. In recent years, several visual metaphors have 
been designed. Below we present some of them. 

A general user interface framework and interaction 
is presented in the InfoGrid [23]. In [18] and [8] the 
semantic organization is addressed, which in our case 
would be just one view of the document space. The 
VIBE system [22] also focuses on the document space, 
but it is based on user given points of interest of the 
query (using weighted attributes). Another metaphor 
for the document space based on inter-particle forces, 
as VIBE, is proposed in [7]. In [32] the document space 
is abstracted from a Venn diagram to an iconic display 
called InfoCrystal. One advantage of this scheme is that 
it is also a visual query language. Visual tools in three- 
dimensions to handle the document space are presented 
in LyberWorld [15]. A more integrated visual scheme 
is given in [12], which is based on the query structure. 
Visualization of occurrence frequency of terms in differ- 
ent text segments of a document is presented in [14]. 
Specific visualizations applied to text are presented in 
[24, 10]. More powerful visualizations are available, in- 
cluding three-dimensional visualization, but they need 
fast hardware [26]. 

3 Text and Query Model 

Let text be the data to be searched. This data may be 
stored in one or more files. It is not necessarily textual 
data, just a sequence of characters. When the text is 
large, fast searching is provided by building an index 
of the text, which is used in subsequent searches. We 
assume that a searching engine of such kind is available. 
This engine implements a given query language which is 
the interface to ours. To improve retrieval capabilities 
and to simplify posing the query, the index is built over 
a normalized text. Text normalization is achieved by 
processing the original text using a user-defined set of 
transformations which include stop words, synonyms, 
character suppression, character translation, etc. 

Depending on the retrieval needs, the user must de- 
fine which positions of the text will be indexed. Every 
position that must be indexed is called an index point. 
The index points are specified over the normalized text. 
In addition to the index points, we may specify pieces 
of the text that will not be indexed (for example, if a 
text contains images or non-indexable data). 

The search for a word or prefix returns all pieces of 
text (matches or occurrences) matching it in the index 
(that is, text positions that are index points). Each 
occurrence is an index point plus its length, used to 
highlight the piece of text that matches the query. Note 
that only pieces of text starting at an index point can 
be retrieved by a query, and thus only those can be 
occurrences. The answer and the original text are used by 
the user interface to display the actual matches. 
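
The notions of index point and occurrence can be sketched as follows. This is a minimal illustration, assuming that index points are placed at word starts and that the index is a list of positions sorted by the text that follows them; it is not the search engine the paper relies on, and the function names are introduced here.

```python
import bisect
import re


def build_index(text):
    """Index points at word starts, sorted by the text following them."""
    points = [m.start() for m in re.finditer(r"\w+", text)]
    points.sort(key=lambda p: text[p:])
    return points


def search(text, index, prefix):
    """Return occurrences as (index point, length) pairs for a prefix."""
    # Truncating sorted suffixes to a fixed length keeps them sorted,
    # so a binary search over the truncated keys is valid.
    keys = [text[p:p + len(prefix)] for p in index]
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_right(keys, prefix)
    return sorted((p, len(prefix)) for p in index[lo:hi])


# Already-normalized sample text (lower case, hypothetical content).
text = "the cat catches a caterpillar; concatenate"
index = build_index(text)
occurrences = search(text, index, "cat")
```

In this sample the string "cat" inside "concatenate" is not returned, because it does not start at an index point.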

Optionally, a text may have a structure. This struc- 
ture can and must be defined by the application pro- 
grammer or the user. The text can be divided into 
documents. The text itself may be considered as one 
document. Each document may be divided into fields. 

We formally define a query as an operation over the 
text that returns two types of objects: 

1. a set of occurrences identified by their position in 
the text and the length and scope of the text that 
matches, and 

2. a set of files (if the text has no structure) or a set of 
logical documents (if the user defines a structure). 

When there is no structure, a query returns at least the 
name of the file that stores the text. The first conse- 
quence of this definition is that boolean operators are 
necessary for two different scopes: sequences of symbols 
(occurrences) and the physical or user-defined structure. 
Operations to give the subset of occurrences that belong 
to a given document are also necessary. For more 
details on the text model and query language see [3, 4]. A 
comparison between different models that also allow queries 
on the text structure, including the model used in 
this paper, is given in [5]. 

A formal description of the result of a query $Q$ is a 
set of documents (or files) $D = \{d_1, \ldots, d_n\}$ and a set of 
text positions (matches) $M = \{m_1, \ldots, m_p\}$ with $p \geq n$. 
Each document has a set of attributes $A = \{a_1, \ldots, a_l\}$. 
We assume that document attributes can be ordered 
and have a minimum and maximum value. Possible 
attributes are: logical or physical position (identifier), 
temporal values (creation or last modification date), 
document size, etc. 
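
One way to hold such a query result in memory is sketched below. The class and field names, the use of character offsets to delimit documents, and the sample values are all assumptions made for illustration; the paper does not prescribe a representation.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    position: int  # index point in the text
    length: int    # length of the matching piece of text


@dataclass
class Document:
    name: str
    start: int  # offset of the first character of the document
    end: int    # offset one past the last character
    attributes: dict = field(default_factory=dict)


def matches_in(doc, matches):
    """Subset of occurrences that belong to a given document."""
    return [m for m in matches if doc.start <= m.position < doc.end]


def attribute_range(docs, name):
    """Minimum and maximum of an orderable document attribute."""
    values = [d.attributes[name] for d in docs]
    return min(values), max(values)


# Hypothetical answer: two documents and three matches.
docs = [
    Document("d1", 0, 100, {"size": 100}),
    Document("d2", 100, 250, {"size": 150}),
]
matches = [Match(10, 3), Match(120, 3), Match(240, 3)]
```

The minimum and maximum returned by `attribute_range` are what a visual mapping needs in order to scale an attribute to a color, size, or position.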

Our visual interface models the browsing and filtering 
process with these three elements: queries, documents 
(possibly with their structure) and text positions. 
Visualization of queries allows the user to follow the history of the 
filtering process. Visualization of the result of each query 
(documents and text positions) helps the user understand the 
result and facilitates the semantic process done by the 
user. 

4 Visual Browsing 

We associate with each element (queries, documents and 
text positions) one or more views, which are related to 
at least one measure (numerical attribute) selected 
by the user. An example of the user interface is 
given in Figure 1. The screen is divided vertically into 
the three elements from right to left: queries, 
documents, and occurrences. We describe them in this order 
and afterwards give some examples of their use as filtering 
tools. 

The user interaction is very simple. The user chooses 
the desired view for each element. Clicking on a query 
changes the document and text views to that query's 
answer. Clicking on a document makes the text view show 
its content. 

4.1 Query Visualization 

Queries are arranged from top to bottom (the most re- 
cent on top), with the current query highlighted. In the 
example we show three different views of each query. 
The pie view is based on the measure given by the 
number of documents selected from the database. Other 
views are obtained by pointing to the query. We show 
two of them as smaller windows in Figure 2(c). The 
one above shows the distribution of occurrences among the 
terms of the query (query map) using boxes of different 
sizes. This view is useful to know which terms are the 
best filters in the given query. Below, we show the 
distribution of the selected documents over the database logical 
space, that is, the underlying document identifier 
mapping (universe map). This view can show if there is any 
logical locality of reference associated with the query. 

More views are possible, but they depend on the specific 
query language. Also, a specific view (concept) may 
have more than one meaningful representation. 
However, the examples above show what is common to all 
text retrieval queries: 

• Query answer size with respect to the universe. 

• Query elements "weight". 

• Locality of reference in the answer. 

4.2 Document Visualization 

The central portion of the screen shows the document 
space selected by the query. There are several 
measures that can be associated with these documents, for 
example the number of occurrences in a document or any 
document attribute. In the example we have two, which 
generate two different views. The top view is the 
document space viewed as a non-uniform grid displayed as 
a fish-eye (see footnote 1), where the grid focus changes as a pointer 
device (say, a mouse) moves and the current document 
is selected with a mouse button. The fish-eye concept is 
only used if the number of documents does not fit within 
the resolution and size of the current document window. 
In this view each document is represented in a different 
gray scale or color, according to a document attribute. 

1 This technique has been used successfully in other 
applications to show large objects, for example graphs [20]. 



In the example in the figure, we use the number of occurrences 
of the query terms in the document. The current document is 
highlighted, and the current value of its attribute is shown. 
In fact, this is an example of the more general concept 
of assigning a visual mapping to document attributes. 
The following visual mappings are possible, and correspond to the 
labeled buttons at the bottom of the document space 
in Figure 1. 

• Order: the order of the documents can be changed 
by a given attribute (provided that it has a total 
order). The order is taken from top to bottom, and 
left to right. 

• Color: the color or gray scale is associated with the 
attribute. The colors or gray scales can 
be automatically generated for a given number of 
levels (default or specified by the user). 

• Size: the size of the square (vertically) can also 
be variable and associated with an attribute. If 
we want to use both dimensions, a possible way 
to do this can be based on tree-maps [31], which 
depict trees as squares of different sizes, but it is 
less efficient. 

For example, with the order and color buttons we can 
sort the documents by number of occurrences (this 
might be useful if we know in advance that the query 
appears many times (dark color) or few times (white color) 
in the document). 

The previous view does not use the fact that the 
screen has two dimensions. Another view uses two 
dimensions, and is based on the occurrences and the 
document structure (if it exists). The vertical axis 
(fish-eyed) represents documents and the horizontal axis their 
structure (see Figure 2(b)). Every field is displayed 
according to its relative size inside the document and has 
a bar which depends on a given measure for each field 
("F" button). Again, the document space can be sorted 
by a different attribute on a specific field instead of 
document order. 

The examples above show what is generic to 
documents. Each document typically has several attributes 
as well as values that depend on the query, which can 
be used with a visual mapping. Default values would 
be logical identifier for the order, number of occurrences 
for the color, and uniform size. Some of these attributes 
can be applied to the structure, for example size of a 
specific field or density of occurrences. 
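
The order and color mappings can be sketched as a sort followed by a quantization of the attribute into a fixed number of gray levels. The function names, the dictionary-based document records, and the convention that a higher level means a darker cell are assumptions for illustration only.

```python
def gray_level(value, lo, hi, levels=8):
    """Quantize value in [lo, hi] to an integer level in [0, levels - 1]."""
    if hi == lo:
        return 0  # all documents share the same attribute value
    fraction = (value - lo) / (hi - lo)
    return min(int(fraction * levels), levels - 1)


def visual_mapping(docs, order_by, color_by, levels=8):
    """Return (document id, gray level) pairs in display order."""
    lo = min(d[color_by] for d in docs)
    hi = max(d[color_by] for d in docs)
    ordered = sorted(docs, key=lambda d: d[order_by], reverse=True)
    return [(d["id"], gray_level(d[color_by], lo, hi, levels)) for d in ordered]


# Hypothetical answer set: order and color both follow occurrence counts.
docs = [
    {"id": 1, "occurrences": 0},
    {"id": 2, "occurrences": 10},
    {"id": 3, "occurrences": 5},
]
cells = visual_mapping(docs, "occurrences", "occurrences", levels=4)
```

Mapping order and color to the same attribute, as here, places the darkest cells first; mapping them to different attributes would instead expose correlations between the two.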

4.3 Text Visualization 

The measure associated with occurrences is their 
position with respect to the whole database or to a given 
document. The latter is used in the example. One 
possible view, as shown, is a window of the document with 
the text positions highlighted. The window has an 
augmented scroll-bar (similar to [21] but in a different 
context) which has marks where the text positions appear 
in the document. These marks may depend on actual 
positions, density, etc., and they can have variable width 
and/or length. 
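
A simple way to derive such scroll-bar marks is to scale each text position to a row of the scroll-bar and count how many occurrences fall on each row. The sketch below assumes a scroll-bar measured in pixel rows and uses the count as the density that could drive the width of a mark; the function name and sample values are hypothetical.

```python
def scrollbar_marks(positions, doc_length, bar_height):
    """Map text positions to scroll-bar rows; values are occurrence counts."""
    marks = {}
    for pos in positions:
        # Integer scaling of the position into [0, bar_height - 1].
        row = min(pos * bar_height // doc_length, bar_height - 1)
        marks[row] = marks.get(row, 0) + 1
    return marks


# Hypothetical document of 1000 characters and a 100-row scroll-bar.
marks = scrollbar_marks([0, 50, 55, 999], doc_length=1000, bar_height=100)
```

Rows with a higher count could be drawn wider or longer, which corresponds to marks that depend on density rather than on exact positions.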

The scroll-bar can be viewed as a complete compact 
view of the text. Clicking on any position of the 
scroll-bar focuses the text view on that part of the document. 
If the document is very large, the scroll-bar can also 
be fish-eyed, with indications of where the nearby 
occurrences are. The same ideas can be generalized to 
provide different granularities of the text view: 

• The text itself is fish-eyed, zooming in where the query 
occurs and giving some adjacent lines to understand the 
context (see Figure 2(a) left). The number of lines 
can be modified by the user. 

• Only the text layout is given, in multiple columns, 
as in [10] (see Figure 2(b) right). Colored (darker) 
parts indicate lines where the query occurs. 
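
The two granularities listed above can be sketched as follows. This is
an illustrative simplification, not the prototype's implementation: it
assumes the text is a list of lines and that an occurrence is a plain
substring match; the names fisheye_lines and layout_marks are
hypothetical.

```python
def fisheye_lines(lines, query, context=1):
    """Keep lines containing the query plus `context` adjacent lines.

    Returns a list of (line_index, text) pairs. A None entry marks a
    gap between two kept regions (lines elided from the view).
    """
    keep = set()
    for i, line in enumerate(lines):
        if query in line:
            lo = max(0, i - context)
            hi = min(len(lines) - 1, i + context)
            keep.update(range(lo, hi + 1))
    view, prev = [], None
    for i in sorted(keep):
        if prev is not None and i > prev + 1:
            view.append(None)
        view.append((i, lines[i]))
        prev = i
    return view


def layout_marks(lines, query):
    """Layout-only view: one flag per line, True where the query occurs."""
    return [query in line for line in lines]


text = ["a", "b fish", "c", "d", "e", "f fish", "g"]
print(fisheye_lines(text, "fish", context=1))
print(layout_marks(text, "fish"))
```

The context parameter plays the role of the user-adjustable number of
lines, and the flags returned by layout_marks correspond to the colored
(darker) parts of the layout view.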



5 Answer Filtering and Selection 

In this section we present a more elaborate metaphor
for manipulating and filtering an answer given by a set
of documents. Figure 3 shows an instance of the visual
analysis tool that we propose for advanced users.
We use a "library" or "bookpile" analogy depending on
whether the tool is used horizontally or vertically, because
both are possible. The "pile" metaphor has been used
before, but in a different way and for different purposes
(for example, see [27, 16]). Each document (seen as a
book) is represented as a rectangle with a particular
color, height, width and position in the set. Each
of these graphical attributes, including the order of the
list, can be mapped to a document attribute (occurrence
density, size, date, etc.). In the example, the order and
the color are mapped to the same attribute (for example,
the creation date). These mappings allow the user to study
different correlations of attributes on the document set,
helping the user to select the desired documents. A
select button allows the user to choose a document subset
with the mouse (the wide-border rectangle in the example).
Universe map

Figure 2: Other views. 
Figure 3: Analyzing and selecting a document set.



The mapping of the attributes is selected with the menu
buttons below the book list. The mapping can be done
linearly or logarithmically (in the case of attributes with
large scales). The way the books are seen can also be
changed. The set of documents can be forced to fit into
the window (as in the example), or presented using a
predefined choice of maximal/minimal widths and heights
(using a scroll-bar if bigger than the window). Another
view is a fish-eye representation for large sets, focusing
where the user wants (by clicking the appropriate sector
with the mouse).
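
The linear and logarithmic mappings mentioned above can be sketched as
follows. This is a sketch under stated assumptions rather than the
tool's actual code: documents are assumed to be dictionaries with an
"id" key, and the names scale and layout_books, as well as the default
minimal and maximal heights, are hypothetical.

```python
import math


def scale(value, vmin, vmax, out_min, out_max, mode="linear"):
    """Map value from [vmin, vmax] to [out_min, out_max].

    mode="log" compresses attributes with large scales and requires
    strictly positive inputs.
    """
    if mode == "log":
        if value <= 0 or vmin <= 0 or vmax <= 0:
            raise ValueError("log scale requires positive values")
        value, vmin, vmax = math.log(value), math.log(vmin), math.log(vmax)
    elif mode != "linear":
        raise ValueError("mode must be 'linear' or 'log'")
    if vmax == vmin:
        return float(out_min)  # all values equal: use the minimal size
    t = (value - vmin) / (vmax - vmin)
    return out_min + t * (out_max - out_min)


def layout_books(docs, order_by, height_by,
                 min_height=10, max_height=100, mode="linear"):
    """Order the 'books' by one attribute and size them by another.

    Returns a list of (id, height) pairs in display order.
    """
    if not docs:
        return []
    values = [d[height_by] for d in docs]
    lo, hi = min(values), max(values)
    ordered = sorted(docs, key=lambda d: d[order_by])
    return [(d["id"], scale(d[height_by], lo, hi,
                            min_height, max_height, mode))
            for d in ordered]


books = [
    {"id": "a", "date": 3, "size": 10},
    {"id": "b", "date": 1, "size": 1000},
    {"id": "c", "date": 2, "size": 100},
]
print(layout_books(books, order_by="date", height_by="size", mode="log"))
```

With the logarithmic mode, the document of size 100 lands about halfway
between the smallest and the largest book, whereas a linear mapping
would leave it close to the minimal height.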

6 Concluding Remarks 

The visual tool interface presented here should be regarded
simply as a set of different views of the same data. The
main idea is that the user chooses the view which is most
suitable for his/her semantic process or filtering needs.
That depends on what the user has in mind, his/her
knowledge of the database, and the type of application.

Some of the views presented for the query level can be
adapted for a visual query language or to allow direct
manipulation of the query from the visual representation.
They can also be used to associate weights with each
element of the query. At the document or text level, a
possible extension is to allow user-defined markers chosen
from a set of standard marks, which may convey
different types of annotations and/or degrees of interest.

We are currently working on a prototype of this interface
using HotJava, which includes the views given in
the example, as well as other views and tools. In particular,
visual tools are a must considering the amount
of text data currently available on the Internet. We want
to extend this interface to related tasks on distributed
and collaborative text databases. We also want to include
more elaborate queries that also involve the
text structure (for example in SGML), using the query
language that we proposed in [19].

The visual representations presented allow the user to
perform several different correlations between the documents
of the result. Those correlations help the semantic analysis
of the user. For example, we could correlate up to four
attributes of documents by choosing the order, color
and size of the document space representation. So, we
can see locality related to date with respect to number
of occurrences or similar relations, depending on partial
information already known by the user but not easy to
represent in the query language. That is a key issue:
some knowledge may be difficult to formalize, but easier to
visualize.

The interface presented must undergo several usability
tests to assess its usefulness, obtain user feedback and
improve its design. The interface is not only useful for
classical text retrieval systems, but can be applied to
many other tools which have to represent data structured
in two levels. Some examples follow:

• Large file systems: for example, a file specification
such as "*.c" or a grep query can be displayed similarly.
Documents in this case are files.

• Network addresses plus files: for example, global
searches using Archie (sources for public software)
or on the WWW (URLs).